ABSTRACT

In ocean surveillance, a number of different types of
transient signals are observed. These sonar signals
are waveforms in one dimension (1-D), and often
display an evolutionary pattern over the time scale.
The hidden Markov model (HMM) is well-suited to
classification of such 1-D signals. Following this
intuition, the application of HMM to sonar transient
classification is proposed and discussed in this paper.
Toward this goal, three different feature vectors
based on the autoregressive (AR) model, the Fourier power
spectrum, and the wavelet transform are considered in
our work. The neural net (NN) classifier has been
successfully used for sonar transient classification.
The same set of features as mentioned above is then
used with an NN classifier. Some concrete experimental
results using "DARPA standard data set I"
with HMM and NN classification schemes are presented.
Finally, a combined NN/HMM classifier is
proposed, and its performance is evaluated with respect
to the individual classifiers.

1.  INTRODUCTION 

The transient sonar signal classification problem
is deemed difficult because of the short duration of
the transients, wide intra-class variations and the effects
of ambient ocean noise. The most common type
of classifier used for this task is the neural net [1],
though other classifiers have been studied [1, 4-5].
Also, it has been found that no single feature extraction
technique can adequately capture all the feature
information for all the ocean acoustic transients of interest.
With this view in mind, we have experimented
with the HMM classifier and three different feature
vectors in this paper. The feature vector based on an
AR model is a natural candidate with the HMM classifier.
As the Fourier power spectrum is widely used
by the NN community for their research, these features
are also considered [1]. Finally, wavelet-transform-based
features are considered. It is well-known
that sonar transients are nonstationary signals. The
wavelet transform can properly represent such signals.
In particular, Daubechies type wavelets are considered
in our work. These wavelets are finite
duration filters and quite easy to implement [2]. It is
our viewpoint that these three very different signal
representations for feature extraction would reveal
some of the latent characteristics of the signal for better
classification.

Finally, we have studied the same set of features
with a multi-layer perceptron neural net (MLP-NN)
classifier with the express objective of finding out the
complementary nature, if any, of these two classifiers,
MLP-NN and HMM. We show in the current paper
that a combined classifier using HMMs and MLP-NNs
is likely to outperform the individual classifiers.
Figure 1 gives the block diagram of our scheme.

2.  FEATURE  REPRESENTATION 

We have three different feature representation
schemes: one based on an autoregressive model, one
based on the Fourier power spectrum, and the other
based on the wavelet transform. The AR coefficients
are computed by the Burg algorithm. Due to a scaling
problem, the gain coefficient is not used.
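As an illustration only, the Burg recursion for the AR coefficients might be sketched as follows. The function name `burg_ar`, the sign convention (prediction-error filter $1 + a_1 z^{-1} + \cdots + a_p z^{-p}$) and the use of NumPy are assumptions of this sketch and are not taken from the paper. The error power is returned for completeness, although the gain term is discarded in the feature vector.

```python
import numpy as np

def burg_ar(x, order):
    """Estimate AR coefficients a[1..order] with Burg's method.

    Convention (an assumption of this sketch): the prediction-error
    filter is 1 + a[1] z^-1 + ... + a[order] z^-order.
    Returns (coefficients, final prediction-error power).
    """
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()    # forward prediction errors f(n), n = 1..N-1
    b = x[:-1].copy()   # backward prediction errors b(n-1), aligned with f
    a = np.array([1.0])
    err = np.dot(x, x) / len(x)
    for _ in range(order):
        # Reflection coefficient minimizing forward + backward error power.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        # Levinson-type update of the prediction-error filter.
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]
        # Update both error sequences, then re-align them for the next order.
        f_new = f + k * b
        b_new = b + k * f
        f, b = f_new[1:], b_new[:-1]
        err *= 1.0 - k * k
    return a[1:], err
```

Under this convention a process generated by x[n] = 1.5 x[n-1] - 0.7 x[n-2] + w[n] should give coefficients close to (-1.5, 0.7).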

2.1.  Fourier  Power  Spectrum 

From the given data segment, its FFT is computed.
Before FFT computation, each data segment is
windowed with a Kaiser-Bessel window function.
The squared magnitude of the FFT coefficients gives
the Fourier power spectrum of the data.
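A minimal sketch of this computation is given below. The window shape parameter `beta` is not stated in the paper, so the value 8.0 is purely an assumed placeholder, and the function name `power_spectrum` is hypothetical.

```python
import numpy as np

def power_spectrum(segment, beta=8.0):
    """Kaiser-Bessel windowed power spectrum of one real data segment.

    beta is an assumed window parameter; the paper does not give one.
    Returns the squared FFT magnitudes for the non-negative frequencies.
    """
    segment = np.asarray(segment, dtype=float)
    window = np.kaiser(len(segment), beta)
    spectrum = np.fft.rfft(segment * window)
    return np.abs(spectrum) ** 2
```

For a 256-point segment this yields 129 spectral values, from which a smaller feature vector can then be selected.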

2.2.  Wavelet  Transform

The Daubechies wavelets are a class of discrete
orthonormal dyadic wavelets. An M order
Daubechies wavelet [2] is given by M coefficients
denoted by $c_j$, $j = 0, \ldots, M-1$. Then, the convolution
of the signal with a FIR filter of length M
($c_j$, $j = 0, \ldots, M-1$) gives the smooth component. On
the other hand, the convolution of the signal with a
FIR filter of length M and coefficients

$$(-1)^j c_{M-1-j}, \quad j = 0, \ldots, M-1,$$

gives the detail component.
After one pass of this algorithm, the smooth
and the detail components are decimated by 2. The
smooth components are then transformed again, and
the procedure continues until we have only two
smooth components left. The output, at this stage, is
the wavelet transform of the original signal. The
coefficients in Daubechies wavelets are obtained from
orthonormality conditions and "smoothness
constraints". For an M order wavelet, these conditions
and constraints lead to exactly M linear equations.
Thus, the M coefficients are uniquely determined [2].
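The pyramid procedure can be sketched for the four-coefficient case as follows. This is an illustration under stated assumptions: the closed-form four-tap Daubechies coefficients, a detail filter obtained from the smoothing filter by the usual quadrature-mirror relation $(-1)^j c_{M-1-j}$, periodic wrap-around at the segment boundary, and a signal length that is a power of two. None of these implementation details is confirmed by the paper, and the names `dwt_step` and `wavelet_transform` are hypothetical.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
# Four-tap Daubechies smoothing filter c_j (closed form).
C = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
# Detail filter from the quadrature-mirror relation (-1)^j c_{M-1-j}.
G = np.array([(-1) ** j * C[len(C) - 1 - j] for j in range(len(C))])

def dwt_step(x):
    """One filtering pass followed by decimation by 2 (periodic wrap-around)."""
    n = len(x)
    # Row k holds the indices 2k, 2k+1, ..., 2k+M-1, taken modulo n.
    idx = (2 * np.arange(n // 2)[:, None] + np.arange(len(C))[None, :]) % n
    smooth = x[idx] @ C
    detail = x[idx] @ G
    return smooth, detail

def wavelet_transform(x):
    """Repeat the pass on the smooth part until two smooth values remain.

    Assumes len(x) is a power of two and at least 4.
    Output layout: [2 smooth values, coarsest details, ..., finest details].
    """
    smooth = np.asarray(x, dtype=float)
    details = []
    while len(smooth) > 2:
        smooth, d = dwt_step(smooth)
        details.append(d)
    return np.concatenate([smooth] + details[::-1])
```

Because the filter pair is orthonormal, the transform preserves signal energy, which gives a convenient sanity check on any implementation.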

2.3.  Feature  Selection 

The feature representation schemes transform the
original signal into feature space. Since some features
may be more useful than others, only the important
features should be selected for a compact
representation of the signal for classification purposes.
In our scheme, the signal is divided into a number of
overlapping segments. All the AR coefficients are
taken as the feature vector since relatively few AR
coefficients are needed to represent a segment. For the
FFT power spectrum and the wavelet transform, the
spectral values and the transform coefficients with relatively
higher magnitude are selected as features.
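The paper does not specify how "relatively higher magnitude" is measured. One possible reading, given here only as an assumed illustration, is to rank each coefficient position by its mean absolute value over the available segments and keep the top k positions. The function name `select_top_magnitude` and the parameter `k` are hypothetical.

```python
import numpy as np

def select_top_magnitude(features, k):
    """Keep the k coefficient positions with the largest mean magnitude.

    features: array of shape (n_segments, n_coefficients).
    Ranking by mean absolute value is an assumption of this sketch.
    Returns (sorted kept indices, reduced feature array).
    """
    features = np.asarray(features, dtype=float)
    score = np.abs(features).mean(axis=0)
    keep = np.sort(np.argsort(score)[-k:])
    return keep, features[:, keep]
```

The same index set would have to be applied to training and test segments so that the feature vectors remain comparable.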

3.  CLASSIFIER  DESIGN 

In our work, we have used two classifiers: HMM
and multi-layer perceptron NN. Each signal template,
i.e., exemplar, is divided into a sequence of partially
overlapping segments. Each segment is then represented
by one feature vector. The sequence of feature
vectors, henceforth denoted as O, is used as one
training/testing observation sequence for the HMM.

3.1.  HMM  Classifier 

To solve our signal classification problem, we
create one HMM for each class. The observation
density in each state of the HMM is assumed to be
multi-dimensional Gaussian. For a classifier of P
classes, we denote the P models by $\lambda_p$, $p = 1, 2, \ldots, P$.
When a signal O of unknown class is given, we calculate

$$p^* = \arg\max_{p} P(O, Q^* \mid \lambda_p)$$

and classify the signal as belonging to class $p^*$. Here,
$Q^*$ represents the optimal state sequence corresponding
to O [3]. For a given $\lambda$, an efficient method to
find $P(O, Q^* \mid \lambda)$ is the well-known Viterbi algorithm
[3].
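The decision rule can be illustrated with a log-domain Viterbi pass. The sketch below assumes diagonal-covariance Gaussian state densities and represents each model as a tuple (log initial probabilities, log transition matrix, state means, state variances). Both choices are assumptions made for brevity, and the function names are hypothetical.

```python
import numpy as np

def log_gaussian(obs, means, variances):
    """Log density of every observation under every state.

    obs: (T, D); means, variances: (N, D). Returns an array of shape (T, N).
    Diagonal covariance is an assumption of this sketch.
    """
    diff = obs[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)

def viterbi_log_score(obs, log_pi, log_A, means, variances):
    """Return log P(O, Q* | model) and the optimal state sequence Q*."""
    log_b = log_gaussian(obs, means, variances)
    n_states = len(log_pi)
    delta = log_pi + log_b[0]
    back = np.zeros((len(obs), n_states), dtype=int)
    for t in range(1, len(obs)):
        cand = delta[:, None] + log_A            # cand[i, j]: best path into i, then i -> j
        back[t] = np.argmax(cand, axis=0)
        delta = cand[back[t], np.arange(n_states)] + log_b[t]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(delta))]
    for t in range(len(obs) - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return float(np.max(delta)), path[::-1]

def classify(obs, models):
    """Index of the model with the highest state-optimized likelihood."""
    scores = [viterbi_log_score(obs, *model)[0] for model in models]
    return int(np.argmax(scores))
```

Working with log probabilities avoids numerical underflow over a sequence of feature vectors, which is why the products in the likelihood appear as sums here.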


In creating the model for each class, we should
guarantee that the parameters we obtain are the optimum
for a given set of training samples. Since our
decision rule is the state-optimized likelihood function,
it requires that the estimated parameter $\lambda$ be
such that $P(O, Q^* \mid \lambda)$ is maximized over all possible
$\lambda$ for the training set. It is shown in [6] that the segmental
K-means algorithm converges to the state-optimized
likelihood function for a wide range of
observation density functions, including the Gaussian
density we have assumed.
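A compact sketch of segmental K-means training is given below. It alternates between re-estimating the model from the current state assignments and re-segmenting each training sequence with the Viterbi algorithm. The uniform initial segmentation, add-one smoothing of the counts, the variance floor, diagonal covariances, and the fixed iteration count are all assumptions of this sketch rather than details reported in the paper.

```python
import numpy as np

def viterbi_path(obs, log_pi, log_A, means, variances):
    """Optimal state path Q* and log P(O, Q* | model), diagonal Gaussians."""
    diff = obs[:, None, :] - means[None, :, :]
    log_b = -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
    n = len(log_pi)
    delta = log_pi + log_b[0]
    back = np.zeros((len(obs), n), dtype=int)
    for t in range(1, len(obs)):
        cand = delta[:, None] + log_A
        back[t] = np.argmax(cand, axis=0)
        delta = cand[back[t], np.arange(n)] + log_b[t]
    path = np.empty(len(obs), dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(len(obs) - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, float(np.max(delta))

def reestimate(sequences, paths, n_states, var_floor=1e-3):
    """Re-estimate all parameters from the current state assignments."""
    all_obs = np.vstack(sequences)
    all_states = np.concatenate(paths)
    pi = np.ones(n_states)                  # add-one smoothing (assumption)
    A = np.ones((n_states, n_states))
    for p in paths:
        pi[p[0]] += 1
        np.add.at(A, (p[:-1], p[1:]), 1)
    # States with no assigned vectors fall back to the global statistics.
    means = np.tile(all_obs.mean(axis=0), (n_states, 1))
    variances = np.tile(all_obs.var(axis=0) + var_floor, (n_states, 1))
    for s in range(n_states):
        members = all_obs[all_states == s]
        if len(members) > 0:
            means[s] = members.mean(axis=0)
            variances[s] = members.var(axis=0) + var_floor
    return np.log(pi / pi.sum()), np.log(A / A.sum(axis=1, keepdims=True)), means, variances

def segmental_kmeans(sequences, n_states, n_iter=10):
    """Alternate parameter re-estimation and Viterbi re-segmentation."""
    # Initial guess: cut every sequence into n_states equal-length runs.
    paths = [np.arange(len(s)) * n_states // len(s) for s in sequences]
    for _ in range(n_iter):
        model = reestimate(sequences, paths, n_states)
        paths = [viterbi_path(s, *model)[0] for s in sequences]
    return reestimate(sequences, paths, n_states)
```

The smoothed transition matrix keeps every transition possible, which corresponds to a fully connected topology.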

In our work, a fully connected HMM topology
is used. For the dataset used in our experiment, the
fully connected HMM topology performs consistently
better than the left-to-right HMM topology. However,
there are sonar signals where the utility of the
left-to-right HMM topology has been demonstrated [4].

3.2.  Multi-Layer  Perceptron  NN  Classifier 

Multi-layer perceptrons (MLP) are feed-forward
nets with one or more layers of nodes between the input
and output layers. The lowest layer is the input
layer, which does not have any processing capability.
The highest layer is the output layer, and any layer
between the input layer and output layer is called a
hidden layer. It is the hidden layer that gives the
MLP-NN classifier the ability to create highly nonlinear
decision surfaces for better discriminative ability.

Generally, the multi-layer perceptrons are trained
with the error back-propagation (EBP) algorithm [7],
which is an iterative gradient algorithm designed to
minimize the mean square error (MSE) between the
desired output $y_j$ and the actual output $\hat{y}_j$.
Sometimes, a momentum term is also included in the training
procedure. In our scheme, there are 21 thirty-dimensional
vectors in the sequence. These 630 features
are used as training features for the NN. The
NN is then designed with 630 input nodes and one hidden
layer with 20 nodes, and is trained using the
back-propagation algorithm and sigmoidal nonlinearity.
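A network of this shape, with sigmoidal units and back-propagation on the MSE, might be sketched as below. The seven output nodes (one per signal class), the weight initialization, the learning rate, and the momentum value are assumptions of this sketch. The paper states only the input and hidden layer sizes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MLP:
    """One-hidden-layer perceptron trained by error back-propagation."""

    def __init__(self, n_in=630, n_hidden=20, n_out=7, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.velocity = [np.zeros_like(p) for p in self.params()]

    def params(self):
        return [self.W1, self.b1, self.W2, self.b2]

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)       # hidden activations
        self.y = sigmoid(self.h @ self.W2 + self.b2)  # actual outputs
        return self.y

    def train_step(self, X, D, lr=0.1, momentum=0.9):
        """One batch update; D holds the desired outputs. Returns the MSE."""
        y = self.forward(X)
        err = y - D
        # Deltas of the squared error, up to a constant factor absorbed by lr.
        delta2 = err * y * (1 - y)
        delta1 = (delta2 @ self.W2.T) * self.h * (1 - self.h)
        grads = [X.T @ delta1, delta1.sum(axis=0), self.h.T @ delta2, delta2.sum(axis=0)]
        for p, v, g in zip(self.params(), self.velocity, grads):
            v *= momentum                 # momentum term
            v -= lr * g / len(X)
            p += v                        # in-place weight update
        return float(np.mean(err ** 2))
```

Each training exemplar would be presented as one 630-dimensional row of `X`, formed by concatenating its 21 thirty-dimensional feature vectors.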

4.  EXPERIMENTAL  RESULTS 
4.1.  Signal  Description 

We have used the DARPA standard data set I for
our experiments. This data set provides seven classes
of signals to test our algorithm. A typical example,
one from each class, is shown in Fig. 2. We denote
these signal classes as:

Class A: Broadband 13-msec. pulse.
Class B: Two 4-msec. pulses, 27-msec. separation.
Class C: 3-kHz tonal, 10-msec. duration.
Class D: 3-kHz tonal, 100-msec. duration.
Class E: 150-Hz tonal, 1-sec. duration.
Class F: 250-Hz tonal, 8-sec. duration.
Class N: Ocean ambient noise.

We have created 45 templates, i.e., exemplars,
for each class, of which 23 are used as training templates
and 22 as test templates. Each signal template
contains 1024 data points. The sampling rate for the
signal is 24,576 Hz. For this sampling rate, 1024 data
points are enough to capture the essential characteristics
of all the transient types including the Class B
type signal, which has the most time spread. This
1024 point signal template is divided into 21 frames,
i.e., segments, of 256 data points with an overlap of
218 points (approximately 85%) between two successive
frames. Once the feature vectors are computed
from each frame, the signal template is
represented by a feature vector sequence.
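The framing arithmetic can be checked with a short sketch. With 256-point frames and a 218-point overlap the hop between frames is 38 points, so a 1024-point template yields 1 + (1024 - 256) // 38 = 21 frames. The function name `frame_signal` is hypothetical, and dropping the few samples left over after the last full frame is an assumption.

```python
import numpy as np

def frame_signal(x, frame_len=256, overlap=218):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    x = np.asarray(x)
    hop = frame_len - overlap                      # 38 points for the values above
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return x[idx]

frames = frame_signal(np.arange(1024))
print(frames.shape)   # (21, 256)
```

The last frame ends at sample 1016, so 8 samples at the end of the template are unused under this assumption.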

The training/testing sets include exemplars from
four different SNR data sets. The SNR is computed
as the ratio of the peak-signal-power to
background-noise-power, expressed in dB. The lowest SNR is 24
dB down with respect to the highest SNR. As a result,
some very noisy exemplars are included in our
experiments. We have tried a different number of
states for the HMM, from $N = 2$ to $N = 12$, and a different
number of nodes, from 10 to 30, in the hidden
layer of the NN. Only the best results are reported in
the paper (Table 1). The results related to AR features
are not reported as they are far inferior to those
of other features.
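The SNR definition used in this subsection can be written out as a small sketch. It assumes that peak signal power means the largest squared sample of the transient and that background noise power means the mean squared value of a noise-only segment. The paper does not spell out either estimate, and the function name `peak_snr_db` is hypothetical.

```python
import numpy as np

def peak_snr_db(signal, noise):
    """Peak-signal-power to background-noise-power ratio in dB.

    Peak power = largest squared sample; noise power = mean square of a
    noise-only segment. Both are assumptions of this sketch.
    """
    peak_power = np.max(np.square(signal))
    noise_power = np.mean(np.square(noise))
    return 10.0 * np.log10(peak_power / noise_power)
```

On this scale a difference of 24 dB corresponds to a power ratio of about 250.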

4.2.  Combined  Classifier 

Every feature/classifier combination has a somewhat
different performance. A pertinent question is:
can we combine the evidence of all the feature/classifier
combinations to yield results that would be superior
to any specific feature/classifier combination?
Such a combined classifier would also be more robust.
As a test, we have devised a classifier, henceforth
called the majority classifier, that takes
the output of each specific feature/classifier combination
and assigns the test exemplar the class with the
majority vote only when the vote reaches a threshold.
Since we have six votes per test exemplar, we
chose thresholds of 3 and 4. If the majority vote is
below the threshold, that test exemplar is not classified.
When the threshold is 4, only two test exemplars
are misclassified, but 12 are not classified.
When the threshold is 3, five test exemplars are
misclassified, but only three are not classified.
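The majority classifier can be sketched in a few lines. The sketch accepts the winning class when its vote count is at least the threshold, which is one reading of "below this threshold" in the text above. Tie-breaking in favor of the class seen first is an assumption, and the function name `majority_classify` is hypothetical.

```python
from collections import Counter

def majority_classify(votes, threshold):
    """Return the class with the most votes, or None if it is rejected.

    votes: class labels output by the individual feature/classifier
    combinations (six per test exemplar in the experiments above).
    A class is accepted when its count is at least `threshold`.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= threshold else None
```

For example, the votes A, A, A, B, C, D are rejected at a threshold of 4 but assigned to class A at a threshold of 3, which shows the trade-off between non-classified and misclassified exemplars.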


Based  on  the  detailed  analysis  of  our  experimen¬ 
tal  results,  the  following  conclusions  are  in  order: 

(1)  To  a  certain  extent,  the  wavelet  based  features 
complement  the  FFT  based  features. 

(2)  To  a  certain  extent,  the  HMM  classifier  com¬ 
plements  the  NN  classifier. 

(3) The combined classifier has the best result.

Only a simple combination is described in the paper.
Other possible combinations of HMM/NN classifiers
should be explored. A hybrid HMM/NN classifier
that combines the time normalization ability of the
HMM and the superlative discriminative ability of
the NN is currently being investigated.
Figure 2. An example of different classes of signals used in our experiment.


                          FFT      Wavelet           Wavelet            Combined
                                   (Daubechies 4)    (Daubechies 20)

HMM                       89.6%    91.5%             90.9%              -
NN                        90.9%    90.9%             93.5%              -
Combined (Threshold=4)    -        -                 -                  *98.6%
Combined (Threshold=3)    -        -                 -                  *96.1%

Table 1. Recognition performance of classifier/feature vector combination; * indicates that
the "non-classified" templates are not included in computing the recognition performance.